Discriminative modeling of context-specific amino acid substitution probabilities
Abstract
2 THE DISCRIMINATIVE MODEL SPACE CONTAINS THE GENERATIVE MODEL SPACE

In the following, we show that the generative model with any set of parameters is equivalent to the discriminative model with an appropriately chosen set of parameters: with these parameters, the discriminative model predicts the same context-specific substitution probabilities P(a|C_i) as the generative model. This equivalence is useful for initializing the discriminative model before training, because the discriminative model tends to get trapped in local optima with very unsatisfactory performance. Initializing training with the parameters of the generative model was an important measure for improving the training of the discriminative model. Suppose the generative model has K context states of length l = 2d + 1 and parameters α_k, p_k(j, a), and w_j = w_center β^|j|. We set the parameters of the discriminative model to the following values:
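The list of those values is not part of this excerpt. The Python sketch below therefore uses an assumed mapping, not the authors' stated one, to illustrate why an equivalence of this kind can hold. It assumes the generative model assigns state posteriors proportional to α_k · Π_j p_k(j, x_j)^(w_j), that the discriminative model is a softmax over a per-state bias plus per-position context weights, and that the window weights decay as w_center · β^|j|. All function names, the parameter names `bias` and `lam`, and the numeric values are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_generative_model(K, d, w_center=1.0, beta=0.9, seed=0):
    """Build arbitrary illustrative parameters alpha_k, p_k(j, a), w_j."""
    rng = random.Random(seed)
    length = 2 * d + 1
    raw = [rng.random() + 0.1 for _ in range(K)]
    alpha = [r / sum(raw) for r in raw]  # mixture weights, sum to 1
    p = []
    for _ in range(K):
        columns = []
        for _ in range(length):
            col = [rng.random() + 0.01 for _ in AMINO_ACIDS]
            z = sum(col)
            columns.append({aa: c / z for aa, c in zip(AMINO_ACIDS, col)})
        p.append(columns)
    # Assumed weight form: largest at the window center, decaying outward.
    w = [w_center * beta ** abs(j - d) for j in range(length)]
    return alpha, p, w


def generative_posterior(window, alpha, p, w):
    """Assumed form: P(k | window) ~ alpha[k] * prod_j p[k][j][x_j] ** w[j]."""
    scores = []
    for k in range(len(alpha)):
        s = alpha[k]
        for j, aa in enumerate(window):
            s *= p[k][j][aa] ** w[j]
        scores.append(s)
    total = sum(scores)
    return [s / total for s in scores]


def to_discriminative(alpha, p, w):
    """Assumed mapping: bias_k = log alpha_k, lam_k(j, a) = w_j * log p_k(j, a)."""
    bias = [math.log(a) for a in alpha]
    lam = [
        [{aa: w[j] * math.log(p[k][j][aa]) for aa in AMINO_ACIDS}
         for j in range(len(w))]
        for k in range(len(alpha))
    ]
    return bias, lam


def discriminative_posterior(window, bias, lam):
    """Softmax over bias[k] + sum_j lam[k][j][x_j] (numerically stabilized)."""
    logits = [
        bias[k] + sum(lam[k][j][aa] for j, aa in enumerate(window))
        for k in range(len(bias))
    ]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    alpha, p, w = random_generative_model(K=4, d=2)
    bias, lam = to_discriminative(alpha, p, w)
    window = "ACDEF"  # length l = 2d + 1 = 5
    gen = generative_posterior(window, alpha, p, w)
    disc = discriminative_posterior(window, bias, lam)
    print(max(abs(g - q) for g, q in zip(gen, disc)))
```

Under these assumptions the two posteriors agree up to floating-point error, because taking the logarithm of the generative score gives exactly the linear form inside the softmax. Starting discriminative training from `to_discriminative(alpha, p, w)` would then begin at the generative solution rather than at a random point.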
Similar resources
Discriminative modelling of context-specific amino acid substitution probabilities
MOTIVATION Protein sequence searching and alignment are fundamental tools of modern biology. Alignments are assessed using their similarity scores, essentially the sum of substitution matrix scores over all pairs of aligned amino acids. We previously proposed a generative probabilistic method that yields scores that take the sequence context around each aligned residue into account. This method...
Improved pairwise alignments of proteins in the Twilight Zone using local structure predictions
MOTIVATION In recent years, advances have been made in the ability of computational methods to discriminate between homologous and non-homologous proteins in the 'twilight zone' of sequence similarity, where the percent sequence identity is a poor indicator of homology. To make these predictions more valuable to the protein modeler, they must be accompanied by accurate alignments. Pairwise sequ...
Ulla: a program for calculating environment-specific amino acid substitution tables
SUMMARY Amino acid residues are under various kinds of local environmental restraints, which influence substitution patterns. Ulla, a program for calculating environment-specific substitution tables, reads protein sequence alignments and local environment annotations. The program produces a substitution table for every possible combination of environment features. Sparse data is handled usin...
Gene-Specific Substitution Profiles Describe the Types and Frequencies of Amino Acid Changes during Antibody Somatic Hypermutation
Somatic hypermutation (SHM) plays a critical role in the maturation of antibodies, optimizing recognition initiated by recombination of V(D)J genes. Previous studies have shown that the propensity to mutate is modulated by the context of surrounding nucleotides and that SHM machinery generates biased substitutions. To investigate the intrinsic mutation frequency and substitution bias of SHMs at...
A combined empirical and mechanistic codon model.
The evolutionary selection forces acting on a protein are commonly inferred using evolutionary codon models by contrasting the rate of synonymous to nonsynonymous substitutions. Most widely used models are based on theoretical assumptions and ignore the empirical observation that distinct amino acids differ in their replacement rates. In this paper, we develop a general method that allows assim...